Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos

Abstract

Despite rapid advances in face recognition, there remains a clear gap between the performance of still image-based face recognition and video-based face recognition, due to the vast difference in visual quality between the domains and the difficulty of curating diverse large-scale video datasets. This paper addresses both of those challenges through an image-to-video feature-level domain adaptation approach that learns discriminative video frame representations. The framework utilizes large-scale unlabeled video data to reduce the gap between different domains while transferring discriminative knowledge from large-scale labeled still images. Given a face recognition network that is pretrained in the image domain, the adaptation is achieved by (i) distilling knowledge from the network to a video adaptation network through feature matching, (ii) performing feature restoration through synthetic data augmentation and (iii) learning a domain-invariant feature through a domain adversarial discriminator. We further improve performance through a discriminator-guided feature fusion that boosts high-quality frames while eliminating those degraded by video domain-specific factors. Experiments on the YouTube Faces and IJB-A datasets demonstrate that each module contributes to our feature-level domain adaptation framework and substantially improves video face recognition performance to achieve state-of-the-art accuracy. We demonstrate qualitatively that the network learns to suppress diverse artifacts in videos such as pose, illumination or occlusion without being explicitly trained for them.
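The discriminator-guided feature fusion mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact weighting rule, so the version below is an assumption: each frame feature is weighted in proportion to a per-frame discriminator score (higher meaning the frame looks more like a high-quality still image), and the weighted average is L2-normalized. The function name `fuse_frame_features` and its parameters are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def fuse_frame_features(features: Sequence[Sequence[float]],
                        scores: Sequence[float]) -> List[float]:
    """Score-weighted average of per-frame features, then L2 normalization.

    Assumption (not from the paper): weights are the discriminator scores
    divided by their sum, so a frame with score 0 contributes nothing.
    """
    if not features or len(features) != len(scores):
        raise ValueError("need at least one frame and one score per frame")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")

    dim = len(features[0])
    fused = [0.0] * dim
    for feat, score in zip(features, scores):
        weight = score / total
        for i in range(dim):
            fused[i] += weight * feat[i]

    # Normalize so the fused vector can be compared with cosine similarity.
    norm = math.sqrt(sum(v * v for v in fused))
    return [v / norm for v in fused] if norm > 0 else fused


# Two hypothetical frames: the second gets a zero score and is ignored.
video_feature = fuse_frame_features([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

With equal scores the function reduces to plain average pooling over frames, which makes the effect of the discriminator scores easy to compare against an unweighted baseline.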
